Noise in Naming Games, partial synchronization and community detection in social networks
The Naming Games (NG) are agent-based models for agreement dynamics, peer pressure and herding in social networks, and protocol selection in autonomous ad-hoc sensor networks. Adding a small noise term to the NG yields a Markov chain model, the Noisy Naming Games (NNG), that is ergodic and in which all partial consensus states are recurrent. Using the Gibbs-Markov equivalence, we show how to obtain the NNG's stationary distribution in terms of the local specification of a related Markov Random Field (MRF). By ordering the partially synchronized states according to their Gibbs energy, taken here to be a good measure of social tension, this approach offers an enhanced method for community detection in social interaction data. We show how the lowest Gibbs energy multi-name states separate and display the hidden community structures within a social network.
I. INTRODUCTION

The possibility of using the Naming Game (NG), a family of agent-based models, for modeling agreement dynamics, leader election and the effects of peer pressure in social networks, protocol and security key selection in encrypted communication in autonomous sensor networks [10], vocabulary selection in linguistics [22] and the role of herding in market crashes in financial networks has recently attracted great interest in the scientific community. Formation of opinions and agreement dynamics in complex financial and social networks are relevant examples of collective / cooperative behaviors that are driven by the individual's innate propensity to imitate and hence herd in the absence of reliable information or when there is information asymmetry. Finite-size effects and the underlying network's topology, such as scale-free and small-world, have significant impact on the expected time to consensus (single-name or total-synchrony state) and other related critical exponents. Indeed, simulations showed departures from the values obtained by classical scaling arguments based on the emergence of the giant cluster in percolation theory and on the dynamical theory of coarsening in high-q Potts models, highlighting the need for more simulation studies and also Markov chain analysis.

A recent application of the Naming Games is to find hidden communities in social networks. The main objectives of this paper are to sharpen the community detection capability of the original Naming Games by the explicit and purposeful addition of rare noise events, and to analyze rigorously the consequences of adding this noise. Community structure is extremely difficult to define and uncover and remains one of the outstanding open problems in network science. Amongst existing methods, the most powerful algorithms are based on the notion of modularity [32], which provides an ordering of the closeness between the given network's community structure and a large family of motifs or partitions. The cost of the optimization problem involved in finding the motif with the largest modularity grows rapidly with the size of the network. While several working heuristic methods have been proposed, they are not better than a spectral method based on the eigenvectors of a Laplacian-like matrix [32]. This procedure of splitting the network into two parts is repeated in a binary tree way until the subgraphs are indivisible.

In particular, direct simulated annealing of the modularity optimization problem will be computationally expensive, just as other Monte-Carlo methods applied to the original Naming Games have yielded good results only after costly long simulations [18]. In this article we indicate how Monte-Carlo methods and other importance sampling algorithms can be numerically efficient on the Noisy Naming Games but not on the original Naming Games. Comparing the Noisy Naming Game method to the modularity optimization procedure, it is worth noting that these two methods start from opposite ends of the community structure: the modularity binary tree method first finds the optimal division of the whole network into two parts, whereas the Noisy Naming Games, beginning from random initial sublists of the allowed words, first find the (last) optimal division into small communities. The modularity increases as the binary tree method proceeds towards the indivisible units, but the Gibbs potential decreases as the Noisy Naming Game proceeds towards the low-lying states near its ground state of total consensus. In this way, the Noisy Naming Games are complementary to the modularity-based spectral binary tree method. If one is looking for the finer divisions of the network into small communities, then the Noisy Naming Games will find them provably faster than other methods, including the modularity-based methods.

But before the Naming Games can provide an applicable procedure for community detection, three problems need to be addressed. The first is the transient nature of its multi-name partial consensus states that correspond to possible community structures. The second concerns the existence and computability of a stationary measure that can be used, like the notion of modularity, to rank the community test-patterns in the form of multi-name partial consensus states. The third issue concerns the computational efficiency of the Naming Games viewed herein as an inverse method for finding communities in networks. It turns out that these problems can be overcome by adding rare noise events to the standard stochastic framework provided by the Markov chain model of the original Naming Games. Unlike the original Naming Games, which have been compared to q-Potts models [24], the Noisy version introduced here is more closely related to spin-glass models with an additional layer of randomness [35]. As is well known, certain aspects of information processing and combinatorial optimization (such as the graph-partitioning problem that is closely related to community detection) can be addressed from the vantage point of the spin-glass framework. In many instances, the ground state of the spin-glass model provides information for the optimization problem. But computing or finding the ground state is often an NP-hard problem, of an equivalent degree of difficulty as the optimization problem itself. The spin-glass framework offers the alternative path, often seen to work well in explicit positive usage of noise to solve hard deterministic problems, of exploiting not only the inherent spin entropy of the model (when the interactions are quenched) but also the additional randomness and corresponding entropy arising from the ensemble of quenched interactions, which the replica method [35] is designed to average over. We exploit these relationships in discussing each of these three issues in formulating an efficient procedure based on the Noisy Naming Games for finding community structures.

First, because the total consensus states of the original Naming Games are the only absorbing states, the detection of communities or partial consensus states is complicated by their transient nature. Meta-stable states representing multi-name or partial consensus configurations arise in the original Naming Games on networks with strong community structure, but have been shown numerically to evolve into other partial-synchrony states after an initial phase, according to a time hierarchy in general. Complete agreement on leader selection or total consensus occurs on time-scales that are relatively short after a long meta-stable stage, with negative consequences for the design of algorithms for detection of hidden communities, for example, in social networks with small-world topologies [31]. Related results on the synchronization of dynamical networks and discrete time maps on networks that highlight the roles of the network topology and the coupling weights have been discovered [26], [30]. In other significant contexts, such as the modeling of brain functions by coupled oscillators on a network, the transient nature of partially synchronous states plays significant positive roles.

The first issue that has to be solved is to find a version of the Naming Games where all partial consensus states are recurrent. This can be done if we come up with a version of the Naming Games which is an ergodic Markov chain. A nice property of ergodic Markov chains is the existence and uniqueness of an invariant measure with positive weights on all the recurrent states. Then the next issue in designing an efficient procedure for community detection starting from the original Naming Games is finding a closed-form expression for this invariant measure and proving that, given the network topology (such as neighborhood structure) and a reasonably small set of allowed words in the original game, this invariant measure ranks the partial consensus states in an averaged sense (macrostates), incorporating all entropic contributions in the Noisy Naming Games. One key property of this invariant measure is the free energy gaps between the low-lying states near the ground state: larger gaps lead to better resolution of the community structure. We show by exact calculations on small cliques that the invariant measure of the Noisy Naming Game inherits this property from its closed-form Gibbs potential. Although the large-scale implementation of the Monte-Carlo simulation of the Noisy Naming Game on a real-world social network remains to be done, our small-scale testing of this method on computer-generated randomized networks of about 60 nodes allows us to infer that the method is computationally efficient. In the context herein, rigorous results stating new conditions for the existence and rank-ordering of recurrent multi-name or partial-consensus states in the Naming Games are useful in view of potential applications to detecting hidden communities in interaction data such as Twitter and Facebook.

In this article, we show that a class of arbitrarily small noisy perturbations of the Naming Games, called Noisy Naming Games (NNG), satisfies these conditions. The main results here are analytical ones. First, the Noisy Naming Games, differing from the NG by an arbitrarily small stochastic perturbation, have a unique invariant distribution with positive weights for partial consensus states. Second, we construct the NNG's invariant distribution as the Gibbs distribution of a related Markov Random Field (NNGGS), called the Noisy Naming Game with saturated training, from its local specification. Third, we show that the Gibbs energy (which is taken to be a reasonably efficient measure of social tension in this article) provides an ordering of the recurrent multi-name or partial-consensus states of the NNG whereby the lowest energy, and thus most probable, ones also have single-name or single-color cliques. This last result has significant impact on the detection of (hidden) community structure in a social network by Monte-Carlo simulations. After the initial equilibration process, the fraction of time spent by the Monte-Carlo simulator in any given partial-consensus state is proportional to the probability it is assigned by the Gibbs invariant measure. Thus, besides identifying the low-lying Gibbs energy states which are the most indicative of underlying community structure (near-cliques or tightly connected clusters have a common single name or color), the simulation, when run long enough, will also provide an ordering of these states. We will show with explicit calculations on simple examples that the exact Gibbs energy of the multi-name states not only orders them but also has the essential property of significant energy gaps between the low-lying states.

In brief, the introduction of additional randomness in the form of rare noisy events to the original Naming Games provides the key to successfully enhancing community detection through the existence and rank-ordering of a large family of stable partial-consensus states, where tightly connected clusters in the network show up as single-named or single-colored subgraphs in the social network. Moreover, these partial consensus states can be found and ranked by simulating the Noisy Naming Games using an equivalent Gibbs sampler method, which was used to produce the figures in the last section. In contrast to the generic need to avoid or prevent noise in most technological systems, the precise and explicit use of noise to achieve a positive aim in this project has few precedents. Significant examples are the classical Parrondo games and the Brownian Ratchet [25].



II. MARKOV CHAIN MODEL



We construct a Markov chain model for the NG [5] where a transition consists of the change of local state at only one site in the network and the set Name of all allowed words is fixed at the outset. It is easier to analyze by the Gibbs Sampler method than the usual NG, to which it is equivalent. Consider a network based on a connected graph containing $N$ sites $S = \{s_i\}$. The neighborhood structure of the network is given by $\{\mathcal{N}(s_i)\}_{s_i \in S}$. Each site has a word list chosen from the finite set $\mathrm{Name} = \{A, B, C, \ldots\}$. The set $\Gamma = \{\gamma_k\}$, which consists of all non-empty subsets of Name, represents all possible "word lists" of a site. The configuration function $X : S \to \Gamma$ maps each site to its word list, so that $X(s_i)$ is the local state of $s_i$. A configuration restricted to a subset of sites $\Xi \subset S$ is denoted by $X(\Xi)$. Therefore the network state $G$, a word list assigned to each site, is given by the configuration $G = \{X(s_i)\}_{s_i \in S} = X(S)$. In the Naming Game (NG), we change the state of the network $G$ as follows:

1. Randomly choose a site $s_i \in S$ as the "listener", with probability $q(s_i)$.

2. Next, choose a site $s_j \in \mathcal{N}(s_i)$ with equal probability as the "speaker". The speaker randomly chooses a word $W$ from its word list with equal probability and sends it to the listener. The latter changes its state $X(s_i)$ into $X'(s_i)$ as follows:

\[
X'(s_i) =
\begin{cases}
X(s_i) \cup \{W\}, & W \notin X(s_i), \\
\{W\}, & W \in X(s_i).
\end{cases}
\]



We call step (2) the local transition step; steps (1) and (2) together comprise a transition between $G = \{X(s_i), X(s_j)\}_{s_j \in S \setminus \{s_i\}}$ and its neighboring state $G' = \{X'(s_i), X(s_j)\}_{s_j \in S \setminus \{s_i\}}$. The global transition probability $P(G, G')$ of the NG depends on the probability of choosing $s_i$ and on the local transition probability from state $X_0(s_i)$ to $X'(s_i)$ under the neighborhood configuration $X_0(\mathcal{N}(s_i))$:

\[
P(G, G') = q(s_i)\, P\big(X_0(s_i), X'(s_i) \,\big|\, X_0(\mathcal{N}(s_i))\big).
\]

By randomly choosing an initial state $G_0 \in \Gamma^S$ and applying steps (1) and (2) in each time period, we obtain a homogeneous Markov chain of the Naming Game $\{G_0, G_1, \ldots, G_n, \ldots\}$, where the transition matrix from $G_n$ to $G_{n+1}$ is given by the formula above.
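The two steps of the listener-only game can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ng_step`, the dictionary representation of the network and the triangle graph are all assumptions made for the example, and the listener is drawn uniformly unless a weight table `q` is supplied.

```python
import random

def ng_step(state, neighbors, rng, q=None):
    """One listener-only Naming Game transition; returns the updated site.

    state:     dict site -> frozenset of words (a non-empty word list)
    neighbors: dict site -> list of adjacent sites
    q:         optional dict site -> listener probability (uniform if None)
    """
    sites = sorted(state)
    weights = None if q is None else [q[s] for s in sites]
    listener = rng.choices(sites, weights=weights, k=1)[0]   # step (1)
    speaker = rng.choice(neighbors[listener])                # step (2)
    word = rng.choice(sorted(state[speaker]))
    if word in state[listener]:
        state[listener] = frozenset([word])          # W known: collapse to {W}
    else:
        state[listener] = state[listener] | {word}   # W unknown: add it
    return listener

# Hypothetical example: a triangle graph with two allowed words A and B.
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
rng = random.Random(0)
state = {0: frozenset("A"), 1: frozenset("B"), 2: frozenset("AB")}
for _ in range(1000):
    ng_step(state, neighbors, rng)
```

Each call changes the word list of at most one site, which is the single-site transition structure used in the Markov chain analysis.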

Next, we show that any invariant measure of this Markov chain is a linear combination of $1_{G_W}$, where $1$ is the indicator function and $G_W$ are the "single-name states" in which every site has only $W$ in its word list. Let $\mathcal{G}_{\gamma_k}$ for $\gamma_k \in \Gamma$ be the set of network states that satisfy the condition $\bigcup_{s_i \in S} X(s_i) = \gamma_k$. It is clear from the NG Markov chain above that single-name states are the only absorbing ones.

To proceed, from every network state $G \in \mathcal{G}_{\gamma_k}$ with $|\gamma_k| \geq 2$ at least one absorbing state $G_W$ is accessible, i.e. there is a path $\{G, G_1, G_2, \ldots, G_n, G_W\}$ with nonzero probability $p = P(G, G_1) P(G_1, G_2) \cdots P(G_n, G_W)$. Since this probability $p$ contributes to the probability of leaving the state $G$ and never returning to it, any state $G$ which is not absorbing is transient. In an invariant measure of a Markov chain, the weight of any transient state is zero; thus, the only states with positive weights in an invariant measure of the NG Markov chain are the single-name states $G_W$ ($W \in \mathrm{Name}$). Since $1_{G_W}$ itself is always an invariant measure, any invariant measure of the Naming Game is a linear combination of $1_{G_W}$ over all $W \in \mathrm{Name}$. In other words, the Naming Game Markov chain, starting from any initial state, will eventually go to a single-name state.



III. NOISY NAMING GAMES (NNG) 

In view of the above Markov chain analysis, we need to perturb the original Naming Game to enhance its community-detection capabilities. It turns out that the class of Noisy Naming Games (NNG), arbitrarily small random perturbations of the NG given next, has the required key property of persistent multi-name steady states. In each step of the NNG Markov chain, the listener has a very small probability $\epsilon$ of receiving a noise word rather than the speaker's message, where the noise word is chosen uniformly at random from Name, and a probability $1 - \epsilon$ of proceeding as in the original NG. Then, given two arbitrary network states, one can construct a path from one to the other with nonzero probability by forcing the network to receive a sequence of noise words as required. In this way, all states of the NNG Markov chain communicate, and the chain is therefore ergodic, for arbitrarily small $\epsilon > 0$. On the other hand, since $\epsilon$ is arbitrarily small, the event of receiving a noise word rarely happens. So the noise essentially changes the long-term behavior but can be ignored over finite times. From the NNG Markov chain's ergodicity, its invariant measure is positive and unique up to a scalar multiple; hence the first main result:

Result: The NNG has a unique invariant probability distribution $\pi(G) > 0$ directly related to its global transition probability $P(G, G')$. Moreover, each multi-name state $G_m$ is recurrent and has positive weight, i.e., $\pi(G_m) > 0$.
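For a network small enough to enumerate, this result can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original analysis: it takes the smallest possible network (two sites joined by one edge), two words, a uniform listener probability $q$, and an arbitrary noise level `eps = 0.05`; the names `nng_transition_matrix` and `stationary` are hypothetical. It builds the exact 9-state NNG transition matrix and finds the invariant distribution by power iteration.

```python
from itertools import product

WORDS = ("A", "B")
LISTS = (frozenset("A"), frozenset("B"), frozenset("AB"))

def update(current, word):
    """NG listener rule: collapse to {W} if W is known, else add W."""
    return frozenset([word]) if word in current else current | {word}

def nng_transition_matrix(eps):
    """Exact NNG transition matrix on a 2-site graph (one edge), uniform q."""
    states = list(product(LISTS, repeat=2))
    index = {g: k for k, g in enumerate(states)}
    P = [[0.0] * len(states) for _ in states]
    for g in states:
        for listener in (0, 1):
            speaker_list = g[1 - listener]
            for w in WORDS:
                # Noise word with probability eps, speaker's word otherwise.
                p_w = eps / len(WORDS)
                if w in speaker_list:
                    p_w += (1 - eps) / len(speaker_list)
                new = list(g)
                new[listener] = update(g[listener], w)
                P[index[g]][index[tuple(new)]] += 0.5 * p_w
    return states, P

def stationary(P, iters=5000):
    """Power iteration pi <- pi P, started from the uniform distribution."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

states, P = nng_transition_matrix(eps=0.05)
pi = stationary(P)
for g, weight in zip(states, pi):
    print("-".join("".join(sorted(x)) for x in g), round(weight, 4))
```

With `eps > 0` every one of the nine states, including the multi-name ones, should receive strictly positive weight; setting `eps = 0` recovers the original NG, whose mass drains into the two single-name states.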

In both the original NG and the NNG, the local transition step (2) is given by $X_0(s_i) \to X'(s_i)$ with corresponding local transition probability $P\big(X_0(s_i), X'(s_i) \,\big|\, X_0(\mathcal{N}(s_i))\big)\, 1_{X'(\mathcal{N}(s_i)) = X_0(\mathcal{N}(s_i))}$. By repeating the NNG's local transition step (2) many times at a fixed site $s_i \in S$ with the same neighborhood state $X_0(\mathcal{N}(s_i))$, we generate a sequence of local states $\{X_0(s_i), X'(s_i), \ldots, X^{(n)}(s_i)\}$ such that the marginal probability distribution of $X^{(n)}(s_i)$ converges to a local conditional probability distribution $f_i(X \,|\, X_0(\mathcal{N}(s_i)))$ as $n$ goes to infinity. Thus, $f_i(X \,|\, X_0(\mathcal{N}(s_i)))$ over all $s_i \in S$ is a well-defined limiting distribution of a local Markov chain with one site and fixed neighborhood state.

To calculate the NNG's invariant distribution, we first construct a related Markov chain (NNGGS) verified to be a Gibbs Sampler for a Markov Random Field [7]. Keeping step (1) the same, the NNGGS replaces the local transition step (2) in the NNG by a "training step": (2') $X_0(s_i) \to X^*(s_i)$, which is the value of a random variable distributed according to the local conditional distribution $f_i(X^* \,|\, X_0(\mathcal{N}(s_i)))$.

The local training step (2') is thus equivalent to repeating the NNG's local transition step (2) an infinite number of times at a fixed site with the same neighborhood state. Its global transition probability is given by

\[
P'(G, G') = q(s_i)\, f_i\big(X'(s_i) \,\big|\, X_0(\mathcal{N}(s_i))\big)\, 1_{X'(\mathcal{N}(s_i)) = X_0(\mathcal{N}(s_i))}.
\]

Since only the local state at a single random site is changed according to a local specification, the NNGGS is indeed a Gibbs Sampler on a Markov Random Field (MRF) with local specification given by the family $f_i(X \,|\, X(\mathcal{N}(s_i)))$ for $s_i \in S$. Here, the NNGGS is more an indirect analytical method to derive the stationary distribution of the NNG in closed form than a method for numerical simulation of the NNG, for which there are better Markov chain Monte-Carlo methods [15].
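For the two-word game, the training step (2') can be written down explicitly, because the local specification has the closed form derived in the appendix: $f_i \propto (p_A^2,\, p_B^2,\, p_A p_B)$ for the local states (A, B, AB). The following sketch assumes that closed form, a uniform choice of the resampled site, and a six-node graph made of two triangles joined by a bridge; the function names and the value `eps = 0.01` are illustrative choices, not taken from the paper.

```python
import random

def receive_probs(site, state, neighbors, eps):
    """p_A, p_B: chance that the listener hears A or B (two words, noise eps)."""
    nbrs = neighbors[site]
    n_a = sum(state[j] == "A" for j in nbrs)
    n_ab = sum(state[j] == "AB" for j in nbrs)
    p_a = eps / 2 + (1 - eps) * (n_a + 0.5 * n_ab) / len(nbrs)
    return p_a, 1.0 - p_a

def local_specification(site, state, neighbors, eps):
    """f_i(. | neighborhood) over the local states A, B and AB."""
    p_a, p_b = receive_probs(site, state, neighbors, eps)
    z = p_a**2 + p_b**2 + p_a * p_b
    return {"A": p_a**2 / z, "B": p_b**2 / z, "AB": p_a * p_b / z}

def nnggs_step(state, neighbors, eps, rng):
    """Training step (2'): resample one uniformly chosen site from f_i."""
    site = rng.choice(sorted(state))
    spec = local_specification(site, state, neighbors, eps)
    labels = list(spec)
    state[site] = rng.choices(labels, weights=[spec[k] for k in labels], k=1)[0]
    return site

# Assumed test graph: triangles {0,1,2} and {3,4,5} joined by the edge 0-3.
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1],
             3: [0, 4, 5], 4: [3, 5], 5: [3, 4]}
state = {s: "AB" for s in neighbors}
rng = random.Random(0)
for _ in range(100):
    nnggs_step(state, neighbors, 0.01, rng)
```

Because each update draws directly from the conditional distribution of one site given its neighbors, this is a Gibbs sampler in the usual sense.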



IV. STATIONARY DISTRIBUTION OF THE GIBBS SAMPLER

Using the Gibbs-Markov equivalence, we construct, in terms of the NNGGS's local specification, a Gibbs potential whose Gibbs distribution is the stationary distribution $\pi'(G)$ of the NNGGS. In other words, under $P'(G, G')$, the NNGGS generates a Markov chain of realized states whose distribution converges to $\pi'(G)$. First, we need to assign a "0" local state for each site. Then $0(E)$ for $E \subset S$ denotes the configuration state where all sites in $E$ are at the 0 local state. The choice of the 0 local state is not unique; for convenience we choose the whole word list $\gamma = \mathrm{Name}$ as the 0 local state. For a clique $L \subset S$ of the network graph, let $x(L)$ be a certain configuration on $L$. Then the clique potential of this configuration is:

\[
V(x(L)) = -\sum_{E \subseteq L} (-1)^{|L - E|} \ln\big[F\big(x(E) \,\big|\, 0(\mathcal{N}(E))\big)\big].
\]

Here the summation is over all subsets of the clique $L$, the neighborhood of a subset $E = \{s_i \,|\, i = 1, \ldots, M\}$ is defined by $\mathcal{N}(E) = \bigcup_{s_i \in E} \mathcal{N}(s_i) \setminus E$, and $F(x(E) \,|\, 0(\mathcal{N}(E)))$ is given by:

\[
F\big(x(E) \,\big|\, 0(\mathcal{N}(E))\big) = \prod_{i=1}^{M}
\frac{f_i\big(x(s_i) \,\big|\, X(s_1, \ldots, s_{i-1}),\, 0(s_{i+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}
     {f_i\big(0(s_i) \,\big|\, X(s_1, \ldots, s_{i-1}),\, 0(s_{i+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}.
\]

Then the Gibbs potential is given by the sum (over all cliques in the network graph) of clique potentials:

\[
H(x) = \sum_{L \subset S} V(x(L)).
\]

The stationary distribution of the NNGGS is thus:

\[
\pi'(x) = Z^{-1} \exp(-H(x)),
\]

where $Z = \sum_{G} \exp(-H(G))$ is the Gibbs partition function [7]. It remains to show that $\pi'$ is invariant under the NNG, that is, $\pi' = \pi$.
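The alternating sum in the clique potential is straightforward to evaluate once the values of $\ln F$ on all subsets of the clique are known. The sketch below is a minimal illustration: the helper names are hypothetical, the site labels `"s1"` and `"s2"` are placeholders, and the inputs are the 2-clique values reported in Section VI together with the appendix's statement that $F = 1$ on every 1-clique (and on the empty set, where the product is empty).

```python
from itertools import combinations
from math import log

def clique_potential(clique, log_f):
    """V(x(L)) = -sum over E subset of L of (-1)^{|L|-|E|} ln F(x(E) | 0(N(E))).

    clique: tuple of site labels
    log_f:  dict mapping frozenset(E) -> ln F for the configuration restricted to E
    """
    total = 0.0
    for r in range(len(clique) + 1):
        for subset in combinations(clique, r):
            sign = (-1) ** (len(clique) - r)
            total += sign * log_f[frozenset(subset)]
    return -total

def two_clique_log_f(f_pair):
    """ln F on all subsets of a 2-clique, with F = 1 on the smaller subsets."""
    return {
        frozenset(): 0.0,
        frozenset({"s1"}): 0.0,
        frozenset({"s2"}): 0.0,
        frozenset({"s1", "s2"}): log(f_pair),
    }

pair = ("s1", "s2")
v_a_ab = clique_potential(pair, two_clique_log_f(1.0))   # configuration A-AB
v_a_a = clique_potential(pair, two_clique_log_f(2.0))    # configuration A-A
v_a_b = clique_potential(pair, two_clique_log_f(0.5))    # configuration A-B
print(round(v_a_a, 4), round(v_a_b, 4))
```

For the 2-clique the lower-order terms vanish, so the potential reduces to $-\ln F$ of the full configuration.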

V. INVARIANT DISTRIBUTION OF THE NNG 

By the existence and uniqueness of the invariant distribution of the NNG, any distribution that is invariant under the NNG must be equal to $\pi(G)$. Here, we verify that the Gibbs distribution $\pi'(G)$ of the NNGGS is invariant under the NNG, hence the explicit construction of $\pi(G)$ as the second main result. We do this by using detailed balance several times. Consider a chosen network state $G_0 = \{X_0(s_1), \ldots, X_0(s_i), \ldots, X_0(s_N)\}$ and any neighboring network state $G_{ik} = \{X_0(s_1), \ldots, \gamma_k, \ldots, X_0(s_N)\}$, where all sites but $s_i$ have the same local states as in $G_0$ and the differing local state is the word list $\gamma_k \in \Gamma$. By detailed balance of the NNGGS:

\[
\pi'(G_0)\, q(s_i)\, f_i\big(\gamma_k \,\big|\, X_0(\mathcal{N}(s_i))\big) = \pi'(G_{ik})\, q(s_i)\, f_i\big(X_0(s_i) \,\big|\, X_0(\mathcal{N}(s_i))\big).
\]

In terms of the global transition probability $P(G, G')$ of the NNG, the change of measure $\Delta\pi'(G_0)$ under a single step of the NNG is

\[
\Delta\pi'(G_0) = \sum_i \sum_k \big[P(G_{ik}, G_0)\, \pi'(G_{ik}) - P(G_0, G_{ik})\, \pi'(G_0)\big]
\]
\[
= \sum_i q(s_i) \sum_k \big[P\big(\gamma_k, X_0(s_i) \,\big|\, X_0(\mathcal{N}(s_i))\big)\, \pi'(G_{ik}) - P\big(X_0(s_i), \gamma_k \,\big|\, X_0(\mathcal{N}(s_i))\big)\, \pi'(G_0)\big],
\]

where the internal sum is taken over all word lists $\gamma_k \in \Gamma$. Each term of the internal sum is zero by the NNG's local detailed balance at the fixed site $s_i$, because $P(\cdot, \cdot \,|\, X_0(\mathcal{N}(s_i)))$ is the local transition probability of the NNG conditioned on the fixed neighborhood state $X_0(\mathcal{N}(s_i))$, and the quotient $\pi'(G_{ik}) / \pi'(G_0)$ has been shown to equal the quotient $f_i(\gamma_k \,|\, X_0(\mathcal{N}(s_i))) / f_i(X_0(s_i) \,|\, X_0(\mathcal{N}(s_i)))$, where $f_i(\cdot \,|\, X_0(\mathcal{N}(s_i)))$ are the values of the local specification at site $s_i$ of the NNGGS. In other words, the NNGGS's local specification is, by construction (repeating the local transition step (2) in the NNG at the fixed site $s_i$ under the same neighborhood state $X_0(\mathcal{N}(s_i))$), the stationary distribution of an auxiliary Markov chain consisting of varying local states at the single site $s_i$ and a fixed boundary state at the sites in $\mathcal{N}(s_i)$. We deduce that $\Delta\pi'(G_0) = 0$, i.e. the Gibbs distribution $\pi'(G)$ of the NNGGS is also the invariant distribution of the NNG.



VI. CLIQUE POTENTIALS AND COMMUNITY STRUCTURE - ANALYSIS

The first example concerns a 2-clique as in Figure 1. With the 0 local state chosen to be Name = {A, B}, and applying the above expression for $F(x(E) \,|\, 0(\mathcal{N}(E)))$ in terms of the local specification, which is in turn calculated from the stationary distributions of the corresponding local Markov chains (each with one site), we derive, in the vanishing noise limit, the values tabulated below. The table gives the 2-clique potential for all possible configurations with the neighborhood fixed at the 0 local state, after using the natural symmetry in the problem. It shows that this 2-clique has lowest energy when the two sites have the same single name.

FIG. 1: 2-Clique

x(L)      F(x(L)|0(N(L)))    V(x(L))
A-AB      1                  0
A-A       2                  -0.6931
A-B       0.5                0.6931

A more interesting example, shown in the following graph, has two 3-cliques and a 2-clique that bridges them. Using the above procedure (and labeling the sites starting from the site on the bridge, e.g. B-A-AB means the site on the bridge has word list B), we calculate the clique potential for the 3-clique and tabulate its values below.

FIG. 2: Example 2

TABLE I: Clique Potential for 3-Clique

x(L)      F(x(L)|0(N(L)))    V(x(L))
A-A-A     15                 -2.7080
AB-A-A    3                  -1.0986
B-A-A     3/5                0.5108

In this way, we calculate the Gibbs potential for each network state and show that multi-name states are ordered by their Gibbs energy, which we take to be a good measure of social tension in a particular state. After the single-name ground state, the state in Figure 2 has the second lowest energy. Significantly, as the third main result and primary focus of this letter, the naming or coloring that reveals the underlying cliques in the graph also has the least energy amongst all multi-name states for a given community (clique) structure in a network. Moreover, it is a local minimum, i.e. any one-step change of this configuration will increase its energy: $H(AAA - BBB) = -4.7230$ and $H(AAA - ABB) = -2.8904$. These lowest energy multi-name states are therefore the most likely ones to be found by simulated annealing of the NNG. This is consistent with the results on meta-stable states in [1], but these states have the advantage, in the NNG, of being persistent (recurrent), thereby enhancing its community-detection capability over the original NG.

VII. STATIONARY PROBABILITY DISTRIBUTIONS OF THE NNG ON SMALL RANDOMIZED NETWORKS - SIMULATIONS

We conclude with a brief discussion of the method used to compute the invariant probability distributions of the NNG on a family of small computer-generated randomized networks consisting of 60 nodes. Using a total word list of cardinality two, we show that the NNG efficiently finds the 40:20 splits into community structures, where two of the communities or subgraphs (by design of about 20 nodes each) have one name / color and the remaining subgraph of 20 nodes has the second name / color. In addition, it also produces an invariant measure which, after discounting the highest values achieved naturally at the total consensus or single-name states at the 0 and 60-node peaks, gives relative weights on the meta-stable states that provide information on community structure. For example, we refer to the symmetric 20 and 40-node peaks in the following figures. About 100 million steps of the NNG were used to produce the invariant measure depicted.
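The kind of estimate described here, a histogram of the time the chain spends at each value of "number of nodes in group A", can be sketched on a much smaller scale. Everything specific in the block below is an assumption for illustration: the six-node graph (two triangles joined by a bridge) stands in for the 60-node networks, and the noise level, burn-in, step count and seed are arbitrary; it is not the code used to produce the figures.

```python
import random
from collections import Counter

def nng_step(state, neighbors, eps, rng):
    """One NNG transition: a uniform listener hears a noise word w.p. eps."""
    listener = rng.choice(sorted(state))
    if rng.random() < eps:
        word = rng.choice("AB")                       # rare noise word
    else:
        speaker = rng.choice(neighbors[listener])
        word = rng.choice(sorted(state[speaker]))     # speaker's message
    current = state[listener]
    state[listener] = frozenset([word]) if word in current else current | {word}

def occupancy_histogram(neighbors, eps, steps, burn_in, seed=0):
    """Fraction of recorded steps spent at each count of pure-A sites."""
    rng = random.Random(seed)
    state = {s: frozenset(rng.choice(["A", "B", "AB"])) for s in neighbors}
    hist = Counter()
    for t in range(burn_in + steps):
        nng_step(state, neighbors, eps, rng)
        if t >= burn_in:
            hist[sum(v == frozenset("A") for v in state.values())] += 1
    return {k: hist[k] / steps for k in sorted(hist)}

# Assumed toy network: triangles {0,1,2} and {3,4,5} joined by the edge 0-3.
neighbors = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1],
             3: [0, 4, 5], 4: [3, 5], 5: [3, 4]}
hist = occupancy_histogram(neighbors, eps=0.05, steps=20000, burn_in=1000)
for count, freq in hist.items():
    print(count, round(freq, 4))
```

On this toy graph one would look for weight at 3 pure-A sites (one triangle per name) in addition to the consensus values 0 and 6, by analogy with the 20 and 40-node peaks described above; how pronounced that is depends on the assumed parameters.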




FIG. 3: Blue triangle for A, red circle for B and black square for AB

FIG. 4: Invariant measure for NNG (horizontal axis: number of nodes in group A, blue triangle)



[Gibbs potential from local specification] 



1. Local specification 

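In general, it is not easy to calculate the invariant measure of a local Markov chain, namely its local specification $f_i(X(s_i) \,|\, X(\mathcal{N}(s_i)))$. However, for small cliques and when there are only a small number of allowed words, such as the two words giving the three local states (A, B, AB) in the above examples, the transition in the Markov chain is totally decided by the message that $s_i$ receives from its neighbors. Let $p_A(s_i)$, $p_B(s_i)$ be the probabilities for $s_i$ to receive A and B respectively. Their values can be calculated from the neighborhood condition $X(\mathcal{N}(s_i))$, more precisely from the numbers of neighbors in states A, B and AB:

\[
p_A(s_i) = \frac{\epsilon}{2} + (1 - \epsilon)\,
\frac{\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = A\} + \frac{1}{2}\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = AB\}}
     {\#\{s_j \in \mathcal{N}(s_i)\}},
\]
\[
p_B(s_i) = \frac{\epsilon}{2} + (1 - \epsilon)\,
\frac{\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = B\} + \frac{1}{2}\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = AB\}}
     {\#\{s_j \in \mathcal{N}(s_i)\}}.
\]

Given that the noise level $\epsilon$ is arbitrarily small, we can take the limit $\epsilon \to 0$:

\[
p_A(s_i) = \frac{\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = A\} + \frac{1}{2}\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = AB\}}
               {\#\{s_j \in \mathcal{N}(s_i)\}},
\]
\[
p_B(s_i) = \frac{\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = B\} + \frac{1}{2}\#\{s_j \in \mathcal{N}(s_i) \,|\, X(s_j) = AB\}}
               {\#\{s_j \in \mathcal{N}(s_i)\}}.
\]

Leaving aside the transitions involving a single-name state, that is, as long as one only considers the transitions between two multi-name states, taking this limit is valid. From detailed balance we get

\[
f_i\big(A \,\big|\, x(\mathcal{N}(s_i))\big)\, p_B(s_i) = f_i\big(AB \,\big|\, x(\mathcal{N}(s_i))\big)\, p_A(s_i),
\]
\[
f_i\big(AB \,\big|\, x(\mathcal{N}(s_i))\big)\, p_B(s_i) = f_i\big(B \,\big|\, x(\mathcal{N}(s_i))\big)\, p_A(s_i),
\]

from which we find the local specification:

\[
f_i\big(A \,\big|\, x(\mathcal{N}(s_i))\big) = p_A^2(s_i) / Z_i(s_i),
\]
\[
f_i\big(B \,\big|\, x(\mathcal{N}(s_i))\big) = p_B^2(s_i) / Z_i(s_i),
\]
\[
f_i\big(AB \,\big|\, x(\mathcal{N}(s_i))\big) = p_A(s_i)\, p_B(s_i) / Z_i(s_i),
\]
\[
Z_i(s_i) = p_A^2(s_i) + p_B^2(s_i) + p_A(s_i)\, p_B(s_i).
\]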
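The closed form can be checked against the one-site chain it is meant to describe. The sketch below is illustrative, with hypothetical function names: it writes the 3x3 transition matrix of the local chain over (A, B, AB) for a given $p_A$ (with $p_B = 1 - p_A$), iterates it to its stationary distribution, and compares the result with $(p_A^2, p_B^2, p_A p_B)/Z_i$. The values of $p_A$ tried are arbitrary.

```python
def local_chain_matrix(p_a):
    """Transition matrix of the one-site chain, state order (A, B, AB)."""
    p_b = 1.0 - p_a
    return [
        [p_a, 0.0, p_b],   # from A:  hear A -> stay at A, hear B -> AB
        [0.0, p_b, p_a],   # from B:  hear B -> stay at B, hear A -> AB
        [p_a, p_b, 0.0],   # from AB: collapse to the word that was heard
    ]

def closed_form(p_a):
    """Local specification (f(A), f(B), f(AB)) from the appendix formula."""
    p_b = 1.0 - p_a
    z = p_a**2 + p_b**2 + p_a * p_b
    return [p_a**2 / z, p_b**2 / z, p_a * p_b / z]

def iterate(matrix, steps=2000):
    """Push the uniform distribution through the chain `steps` times."""
    dist = [1.0 / 3.0] * 3
    for _ in range(steps):
        dist = [sum(dist[i] * matrix[i][j] for i in range(3)) for j in range(3)]
    return dist

for p_a in (0.5, 0.75, 5.0 / 6.0):
    limit = iterate(local_chain_matrix(p_a))
    exact = closed_form(p_a)
    print(p_a, [round(x, 6) for x in limit], [round(x, 6) for x in exact])
```

The same comparison holds with a finite noise level, since noise only changes the numerical values of $p_A$ and $p_B$ and not the structure of the local chain.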

2. Properties of F(x(E)\0(Af(E))) 

By definition, $F(x(E) \,|\, 0(\mathcal{N}(E)))$ is given by:

\[
F\big(x(E) \,\big|\, 0(\mathcal{N}(E))\big) = \prod_{i=1}^{M}
\frac{f_i\big(x(s_i) \,\big|\, X(s_1, \ldots, s_{i-1}),\, 0(s_{i+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}
     {f_i\big(0(s_i) \,\big|\, X(s_1, \ldots, s_{i-1}),\, 0(s_{i+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}.
\]

Several facts will help us to simplify this formula.

1. In each factor of the above product, the conditionals in the numerator and denominator are the same. So if $x(s_i) = 0(s_i)$, the factor is 1. Therefore only the sites not in state 0 count in the product. A direct consequence of this fact is $F(0(E) \,|\, 0(\mathcal{N}(E))) = 1$.

2. For any 1-clique in this example, $F(AB(s_i) \,|\, 0(\mathcal{N}(s_i))) = 1$ according to the last paragraph, and

\[
F\big(A(s_i) \,\big|\, 0(\mathcal{N}(s_i))\big) = \frac{f_i\big(A \,\big|\, 0(\mathcal{N}(s_i))\big)}{f_i\big(AB \,\big|\, 0(\mathcal{N}(s_i))\big)} = \frac{p_A(s_i)}{p_B(s_i)} = 1.
\]

Similarly $F(B(s_i) \,|\, 0(\mathcal{N}(s_i))) = 1$; thus, the function $F$ on any 1-clique has value 1.
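3. Consider the clique $E = \{s_1, \ldots, s_M\}$ and two of its configurations which differ only at one site $s_m$. By relabeling the sites we can write the two configurations as $X(E) = \{x(s_1), \ldots, x(s_{m-1}), x(s_m), 0(s_{m+1}), \ldots, 0(s_M)\}$ and $Y(E) = \{x(s_1), \ldots, x(s_{m-1}), y(s_m), 0(s_{m+1}), \ldots, 0(s_M)\}$. In the ratio $F(Y(E) \,|\, 0(\mathcal{N}(E))) / F(X(E) \,|\, 0(\mathcal{N}(E)))$, the factors for the 0 states do not appear and the factors for $s_1, \ldots, s_{m-1}$ cancel. So we have the recursive relationship:

\[
\frac{F\big(Y(E) \,\big|\, 0(\mathcal{N}(E))\big)}{F\big(X(E) \,\big|\, 0(\mathcal{N}(E))\big)} =
\frac{f_m\big(y(s_m) \,\big|\, X(s_1, \ldots, s_{m-1}),\, 0(s_{m+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}
     {f_m\big(x(s_m) \,\big|\, X(s_1, \ldots, s_{m-1}),\, 0(s_{m+1}, \ldots, s_M),\, 0(\mathcal{N}(E))\big)}.
\]

Using this relationship we calculate the function value of $F$ recursively.

3. 2-Clique

In the example above, the network has one 2-clique and two similar 3-cliques. We need to calculate the function value of $F$ for certain configurations on the given cliques. In the following calculations, $p_A(s_i)$ and $p_B(s_i)$ are always calculated according to the current neighbor configuration. For the 2-clique shown in Figure 1, we have

\[
F\big(AB - AB \,\big|\, 0(\mathcal{N}(E))\big) = F\big(0(E) \,\big|\, 0(\mathcal{N}(E))\big) = 1,
\]
\[
F\big(A - AB \,\big|\, 0(\mathcal{N}(E))\big) = \frac{f_1\big(A \,\big|\, 0(s_2), 0(\mathcal{N}(E))\big)}{f_1\big(AB \,\big|\, 0(s_2), 0(\mathcal{N}(E))\big)} = \frac{p_A(s_1)}{p_B(s_1)} = 1,
\]
\[
F\big(A - A \,\big|\, 0(\mathcal{N}(E))\big) = F\big(A - AB \,\big|\, 0(\mathcal{N}(E))\big)\, \frac{f_2\big(A \,\big|\, A(s_1), 0(\mathcal{N}(E))\big)}{f_2\big(AB \,\big|\, A(s_1), 0(\mathcal{N}(E))\big)} = \frac{p_A(s_2)}{p_B(s_2)} = 2,
\]
\[
F\big(A - B \,\big|\, 0(\mathcal{N}(E))\big) = F\big(A - AB \,\big|\, 0(\mathcal{N}(E))\big)\, \frac{f_2\big(B \,\big|\, A(s_1), 0(\mathcal{N}(E))\big)}{f_2\big(AB \,\big|\, A(s_1), 0(\mathcal{N}(E))\big)} = \frac{p_B(s_2)}{p_A(s_2)} = \frac{1}{2}.
\]

4. 3-Clique

For the 3-clique on the left in Figure 2,

\[
F\big(AB - AB - AB \,\big|\, 0(\mathcal{N}(E))\big) = F\big(0(E) \,\big|\, 0(\mathcal{N}(E))\big) = 1,
\]
\[
F\big(AB - AB - A \,\big|\, 0(\mathcal{N}(E))\big) = \frac{f_3\big(A \,\big|\, 0(s_1, s_2), 0(\mathcal{N}(E))\big)}{f_3\big(AB \,\big|\, 0(s_1, s_2), 0(\mathcal{N}(E))\big)} = \frac{p_A(s_3)}{p_B(s_3)} = 1,
\]
\[
F\big(AB - A - A \,\big|\, 0(\mathcal{N}(E))\big) = F\big(AB - AB - A \,\big|\, 0(\mathcal{N}(E))\big)\, \frac{f_2\big(A \,\big|\, 0(s_1), A(s_3), 0(\mathcal{N}(E))\big)}{f_2\big(AB \,\big|\, 0(s_1), A(s_3), 0(\mathcal{N}(E))\big)} = \frac{p_A(s_2)}{p_B(s_2)} = 3,
\]
\[
F\big(A - A - A \,\big|\, 0(\mathcal{N}(E))\big) = F\big(AB - A - A \,\big|\, 0(\mathcal{N}(E))\big)\, \frac{f_1\big(A \,\big|\, A(s_2, s_3), 0(\mathcal{N}(E))\big)}{f_1\big(AB \,\big|\, A(s_2, s_3), 0(\mathcal{N}(E))\big)} = 3\, \frac{p_A(s_1)}{p_B(s_1)} = 15,
\]
\[
F\big(B - A - A \,\big|\, 0(\mathcal{N}(E))\big) = F\big(AB - A - A \,\big|\, 0(\mathcal{N}(E))\big)\, \frac{f_1\big(B \,\big|\, A(s_2, s_3), 0(\mathcal{N}(E))\big)}{f_1\big(AB \,\big|\, A(s_2, s_3), 0(\mathcal{N}(E))\big)} = 3\, \frac{p_B(s_1)}{p_A(s_1)} = \frac{3}{5}.
\]

From A-B symmetry and graph symmetry, we can get other function values from these results.

5. Gibbs Potential

The 3-clique in Figure 2 has three 2-cliques and three 1-cliques embedded in it. Adding them, calling the sum the "net clique potential" for the 3-clique $L$, and using the fact that the terms for 2-cliques cancel, the result turns out to be $V_{net}(x(L)) = -\ln\big(F(x(L) \,|\, 0(\mathcal{N}(L)))\big)$. Finally, the total Gibbs potential includes the clique potential of the 2-clique and the net clique potentials of the two 3-cliques:

\[
H(AAA - BBB) = V(A - B) + V_{net}(A - A - A) + V_{net}(B - B - B)
\]
\[
= -\big\{\ln F\big(A - B \,\big|\, 0(\mathcal{N}(E))\big) + 2 \ln F\big(A - A - A \,\big|\, 0(\mathcal{N}(E))\big)\big\} = -4.7230,
\]

\[
H(AAA - ABB) = V(A - A) + V_{net}(A - A - A) + V_{net}(A - B - B)
\]
\[
= -\big\{\ln F\big(A - A \,\big|\, 0(\mathcal{N}(E))\big) + \ln F\big(A - A - A \,\big|\, 0(\mathcal{N}(E))\big) + \ln F\big(A - B - B \,\big|\, 0(\mathcal{N}(E))\big)\big\} = -2.8904.
\]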
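The recursive evaluation of $F$ and the two Gibbs energies can be reproduced with exact rational arithmetic. The sketch below rests on stated assumptions: Figure 2 is read as two triangles {0, 1, 2} and {3, 4, 5} joined by the bridge edge 0-3, the vanishing-noise local specification is used, and sites are filled in with the bridge site last in each triangle (and $s_1$ before $s_2$ on the bridge), an ordering chosen here so that no ratio has a zero denominator in this limit. The function names are hypothetical.

```python
from fractions import Fraction
from math import log

# Assumed reading of Fig. 2: triangles {0,1,2} and {3,4,5}, bridge edge 0-3.
NEIGHBORS = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1],
             3: [0, 4, 5], 4: [3, 5], 5: [3, 4]}

def ratio(site, target, state):
    """f_i(target | nbrs) / f_i(AB | nbrs) in the vanishing-noise limit."""
    nbrs = NEIGHBORS[site]
    n_a = sum(state[j] == "A" for j in nbrs)
    n_b = sum(state[j] == "B" for j in nbrs)
    n_ab = len(nbrs) - n_a - n_b
    p_a = Fraction(2 * n_a + n_ab, 2 * len(nbrs))
    p_b = 1 - p_a
    if target == "AB":
        return Fraction(1)
    return p_a / p_b if target == "A" else p_b / p_a

def big_f(assignment, order):
    """F(x(E) | 0(N(E))): set sites one by one, everything else at 0 = AB."""
    state = {s: "AB" for s in NEIGHBORS}
    value = Fraction(1)
    for site in order:
        value *= ratio(site, assignment[site], state)
        state[site] = assignment[site]
    return value

f_aaa = big_f({0: "A", 1: "A", 2: "A"}, [2, 1, 0])      # A-A-A
f_ab_aa = big_f({0: "AB", 1: "A", 2: "A"}, [2, 1, 0])   # AB-A-A
f_baa = big_f({0: "B", 1: "A", 2: "A"}, [2, 1, 0])      # B-A-A
f_bridge_aa = big_f({0: "A", 3: "A"}, [0, 3])           # A-A on the bridge
f_bridge_ab = big_f({0: "A", 3: "B"}, [0, 3])           # A-B on the bridge

# By A-B and graph symmetry, F(B-B-B) = F(A-A-A) and F(A-B-B) = F(B-A-A).
h_aaa_bbb = -(log(float(f_bridge_ab)) + 2 * log(float(f_aaa)))
h_aaa_abb = -(log(float(f_bridge_aa)) + log(float(f_aaa)) + log(float(f_baa)))
print(f_aaa, f_ab_aa, f_baa, round(h_aaa_bbb, 4), round(h_aaa_abb, 4))
```

Under the Gibbs distribution the ratio of the two weights is $\exp(H(AAA - ABB) - H(AAA - BBB)) = 112.5 / 18 = 6.25$, so the configuration that colors each triangle with its own name is favored over the one with a single mismatched bridge site.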